{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "name": "Chapter 9: Hypothesis testing.ipynb",
      "provenance": [],
      "collapsed_sections": []
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    },
    "accelerator": "GPU"
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "vCYOhQMVnuQw",
        "colab_type": "text"
      },
      "source": [
        "A study of youth mental health reported that 48% of parents believed social media was the cause of their teenager's stress. \n",
        "\n",
        "**Population**: Parents with a teenager (age >= 18)\n",
        "\n",
        "**Parameter of Interest**: p, the proportion of parents who believe social media is the cause of their teenager's stress\n",
        "\n",
        "**Null Hypothesis**: p = 0.48\n",
        "\n",
        "**Alternative Hypothesis**: p > 0.48\n",
        "\n",
        "**Data**: 4500 people were surveyed, and 65% of them believe that their teenager's stress is caused by social media. "
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "mIdfXTIIeoaJ",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "n = 4500      # sample size\n",
        "pnull = 0.48  # proportion under the null hypothesis\n",
        "phat = 0.65   # observed sample proportion"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "niS8pgXQo4Bh",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "import statsmodels.api as sm\n",
        "import numpy as np\n",
        "import matplotlib.pyplot as plt\n",
        "import pandas as pd"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "JMPF7o4yo5zi",
        "colab_type": "code",
        "outputId": "28c418a1-0ae2-4fa6-96f7-61bde9c50b27",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 34
        }
      },
      "source": [
        "# One-sided z-test for a proportion; returns (z statistic, p-value)\n",
        "sm.stats.proportions_ztest(phat * n, n, pnull, alternative='larger')"
      ],
      "execution_count": 0,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "(23.90916877786327, 1.2294951052777303e-126)"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 5
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "GOr0WiKmpO56",
        "colab_type": "text"
      },
      "source": [
        "The calculated p-value, 1.2294951052777303e-126, is far smaller than a conventional significance level such as 0.05, so we reject the null hypothesis (H0) in favor of the alternative p > 0.48. "
      ]
    },
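    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "As a rough check, the z statistic reported by `proportions_ztest` above can be reproduced by hand. The sketch below assumes statsmodels' default behaviour (`prop_var=False`), in which the standard error is computed from the sample proportion; many textbooks instead use the null proportion, which gives a somewhat different statistic. The names `se_hat`, `se_null`, `z_hat` and `z_null` are illustrative and not part of the original notebook."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "import numpy as np\n",
        "from scipy.stats import norm\n",
        "\n",
        "n, pnull, phat = 4500, 0.48, 0.65\n",
        "\n",
        "# Standard error from the sample proportion (assumed statsmodels default).\n",
        "se_hat = np.sqrt(phat * (1 - phat) / n)\n",
        "z_hat = (phat - pnull) / se_hat\n",
        "\n",
        "# Standard error under the null hypothesis (common textbook form).\n",
        "se_null = np.sqrt(pnull * (1 - pnull) / n)\n",
        "z_null = (phat - pnull) / se_null\n",
        "\n",
        "# One-sided (upper-tail) p-values for the alternative p > pnull.\n",
        "print(\"z (sample-proportion SE) = {:.3f}, p-value = {}\".format(z_hat, norm.sf(z_hat)))\n",
        "print(\"z (null-proportion SE)   = {:.3f}, p-value = {}\".format(z_null, norm.sf(z_null)))"
      ],
      "execution_count": null,
      "outputs": []
    },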
    {
      "cell_type": "code",
      "metadata": {
        "id": "LPBvaBk7r6FG",
        "colab_type": "code",
        "outputId": "d1ab53db-bf13-4147-f2d0-7b3e8935a20e",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 34
        }
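      },
      "source": [
        "import numpy as np\n",
        "\n",
        "# Simulated sample: 89 random integers in [200, 250)\n",
        "sdata = np.random.randint(200, 250, 89)\n",
        "# One-sample z-test of H0: mean = 80 against H1: mean > 80\n",
        "sm.stats.ztest(sdata, value=80, alternative=\"larger\")"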
      ],
      "execution_count": 0,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "(96.71588016677123, 0.0)"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 6
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "7WkKtI6gnQma",
        "colab_type": "text"
      },
      "source": [
        "# T-test\n"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "sNUM1QA3nT6o",
        "colab_type": "code",
        "outputId": "7650e187-c338-4d41-d18e-bd8990b913e5",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 102
        }
      },
      "source": [
        "height = np.array([172, 184, 174, 168, 174, 183, 173, 173, 184, 179, 171, 173, 181, 183, 172, 178, 170, 182, 181, 172, 175, 170, 168, 178, 170, 181, 180, 173, 183, 180, 177, 181, 171, 173, 171, 182, 180, 170, 172, 175, 178, 174, 184, 177, 181, 180, 178, 179, 175, 170, 182, 176, 183, 179, 177])\n",
        "height"
      ],
      "execution_count": 0,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "array([172, 184, 174, 168, 174, 183, 173, 173, 184, 179, 171, 173, 181,\n",
              "       183, 172, 178, 170, 182, 181, 172, 175, 170, 168, 178, 170, 181,\n",
              "       180, 173, 183, 180, 177, 181, 171, 173, 171, 182, 180, 170, 172,\n",
              "       175, 178, 174, 184, 177, 181, 180, 178, 179, 175, 170, 182, 176,\n",
              "       183, 179, 177])"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 32
        }
      ]
    },
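    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "A one-sample t-test of `height` against a hypothesized mean of 175 can be sketched by hand as follows. This is a minimal sketch that assumes the `height` array from the previous cell and a two-sided alternative; `mu0`, `n_obs`, `sample_sd`, `t_manual` and `p_manual` are illustrative names, not part of the original notebook."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "import numpy as np\n",
        "from scipy.stats import t\n",
        "\n",
        "mu0 = 175  # hypothesized population mean (assumed null value)\n",
        "n_obs = len(height)  # `height` comes from the previous cell\n",
        "sample_sd = np.std(height, ddof=1)  # sample standard deviation\n",
        "\n",
        "# t statistic: distance of the sample mean from mu0 in standard-error units.\n",
        "t_manual = (np.mean(height) - mu0) / (sample_sd / np.sqrt(n_obs))\n",
        "\n",
        "# Two-sided p-value from the t distribution with n_obs - 1 degrees of freedom.\n",
        "p_manual = 2 * t.sf(abs(t_manual), df=n_obs - 1)\n",
        "print(\"t = {:.3f}, p-value = {:.4f}\".format(t_manual, p_manual))"
      ],
      "execution_count": null,
      "outputs": []
    },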
    {
      "cell_type": "code",
      "metadata": {
        "id": "py3Tg4B3nlkP",
        "colab_type": "code",
        "outputId": "a1329f07-e2f7-429d-834a-b117d6b5e0bf",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 68
        }
      },
      "source": [
        "from scipy.stats import ttest_1samp\n",
        "import numpy as np\n",
        "\n",
        "# H0: the mean height is 175; H1 (two-sided): the mean height differs from 175.\n",
        "height_average = np.mean(height)\n",
        "print(\"Average height is = {0:.3f}\".format(height_average))\n",
        "\n",
        "tstat, pval = ttest_1samp(height, 175)\n",
        "\n",
        "print(\"P-value = {}\".format(pval))\n",
        "\n",
        "# A large p-value means insufficient evidence against H0, not proof of H0.\n",
        "if pval < 0.05:\n",
        "  print(\"We reject the null hypothesis.\")\n",
        "else:\n",
        "  print(\"We fail to reject the null hypothesis.\")\n"
      ],
      "execution_count": 0,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Average height is = 175.618\n",
            "P-value = 0.35408130524750125\n",
            "We fail to reject the null hypothesis.\n"
          ],
          "name": "stdout"
        }
      ]
    }
  ]
}